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1. Introduction 

Recently, Lindgren, Rue and Lindstrom (2011) derived a link between certain 
Gaussian fields, that can be represented as solutions to stochastic partial dif- 
ferential equations (SPDEs), and Gaussian Markov random fields (GMRFs). 
The main idea is to approximate these Gaussian fields using basis expansions 
J2i Wi<fi(s) where the stochastic weights {wi} are calculated using the stochastic 
weak formulation of the corresponding SPDE. For certain choices of the basis 
functions {ifi}, especially compactly supported functions, the weights form 
GMRFs. Because of the Markov property of the weights, fast numerical tech- 
niques for sparse matrices can be used when estimating parameters and doing 
spatial prediction in these models. This greatly improves the applicability to 
problems involving large data sets, where traditional methods in statistics fail 
due to computational issues. However, the advantages of representing Gaussian 
fields as solutions to SPDEs are not only computational. Using the SPDE rep- 
resentation, non-stationary extensions are easily obtained by allowing spatially 
varying parameters in the SPDE (Lindgren, Rue and Lindstrom, 2011), and the 
model class can be generalized to include more general covariance structures by 
generalizing the class of generating SPDEs (Bolin and Lindgren, 2011). These 
are indeed useful features from an applied point of view as many applications 
require complicated non-stationary models to accurately capture the covariance 
structure of the data. 
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So far these methods have only been used in Gaussian settings, and it has 
not been clear whether they are applicable when the Gaussianity assumption 
cannot be justified. Therefore, this work will focus on extending the SPDE 
methods beyond Gaussianity. A new type of non-Gaussian models that has 
proved to be useful in practical applications is the Laplace moving average mod- 
els (Aberg, Podgorski and Rychlik, 2009, Aberg and Podgorski, 2011). These 
are processes obtained by convolving some deterministic kernel function with 
stochastic Laplace noise. The models share many good properties with the Gaus- 
sian models while allowing for heavier tails and asymmetry in the data, making 
them interesting alternatives in practical applications (see e.g. Bogsjo, Podgorski 
and Rychlik, 2012). One of the motivating examples in Aberg and Podgorski 
(2011) is a Laplace moving average model with Matern covariances. This model 
can be seen as the solution to the same SPDE that generates Gaussian Matern 
field but where the Gaussian white noise forcing is replaced with Laplace noise. 
It has previously been shown that the SPDE model formulation of Gaussian 
Matern fields has many computational advantages compared with the process 
convolution formulation (Bolin and Lindgren, 2009, Simpson, Lindgren and Rue, 
2010). We demonstrate here that for the Laplace moving average models, the 
SPDE formulation can also be used to derive a new likelihood-based parameter 
estimation technique as well as an efficient simulation procedure. 

The structure of the paper is as follows. Section 2 contains an introduction to 
the Matern covariance family and the SPDE formulation in the Gaussian case. 
In Section 3, stochastic Laplace fields are introduced, and some properties of the 
Laplace-driven SPDE model are derived. Subsequently, in Section 4, the Markov 
approximation technique by Lindgren, Rue and Lindstrom (2011) is extended 
to the Laplace model, and its sampling is discussed in Section 5. A parameter 
estimation technique based on the EM algorithm is derived in Section 6, and 
Section 7 contains a simulation study showing that it gives reliable parameter 
estimates. Finally, Section 8 contains a summary and discussion of future work 
and possible extensions. 

2. Gaussian Matern fields 

The Matern covariance family (Matern, 1960) is often used when modeling spa- 
tial data. There are a few different parameterizations of the Matern covariance 
function in the literature, and the one most suitable in our context is 

C(h) = - jlT^a, 2 («||h||r^(«||h||), h£M d , (1) 

where d is the dimension of the domain, v is a shape parameter, k 2 a scale 
parameter, (j> 2 a variance parameter, and K v is a modified Bessel function of 
the second kind of order v > 0. The associated spectrum is 
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As the properties of Gaussian fields are given by their first two moments, the 
standard way of specifying Gaussian Matern fields is to chose the mean value, 
/i(s) possibly spatially varying, and then let the covariance function be of the 
form (1). An alternative way of specifying a Gaussian field on R d is to view it 
as a process convolution 

X(s) = [ fc(s,u)B(du), (3) 

where k is some deterministic kernel function and B is a Brownian sheet (Hig- 
don, 2001). One of the advantages with this construction is that non-stationary 
extensions are easily constructed by allowing the convolution kernel to be de- 
pendent on the location s. If, however, the process is stationary, the kernel k 
depends only on s — u and the covariance function for X is 

C(h) = / fe(u- h)fc(u)du. 

Thus, the covariance function C, the spectrum S, and the kernel k are related 
through 

{2v) d \F{k)\ 2 = F{C) = S, 

where T(-) denotes the Fourier transform. Since the spectral density for a 
Matern field in dimension d with parameters v, <f) 2 , and n is given by (2), 
one finds that the corresponding symmetric non-negative kernel is a Matern 
covariance function with parameters i^k — ^ — j, <Pk — \R>, and k^ = n. 

In yet another setting, Gaussian Matern fields can be viewed as the solution 
to the SPDE 

(k 2 - A)fX(s) =0W(s), (4) 

where W(s) is Gaussian white noise, A = ^f=i ~§~p is the Laplace operator, 
and a = v + d/2 (Whittle, 1963). As discussed in Lindgren, Rue and Lind- 
strom (2011), there is an implicit assumption of appropriate boundary condi- 
tions needed if one wants the solutions to be stationary Matern fields. 

The connection between (3) and (4) is through the Green's function of the 
differential operator in (4) 

i a-d 

G Q (s,t) = —-j—^r -(K||s-t||)^l^W| S -t||), (5) 

that serves as a kernel in (3). It is straightforward to show that G a € L p (R d ) 
if and only if a > L£rl3fj ( se e for example Samko, Kilbas and Maricev (1992) 

p. 538), and in particular a > d/2 guarantees that G a € 7^(18^). 

A non-Gaussian model with Matern covariances could be constructed either 
using the process convolution formulation (3) where the Brownian sheet is re- 
placed by some non-Gaussian process, or through the SPDE formulation (4) 
with non-Gaussian noise. Such non-Gaussian extensions are discussed next. 
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3. Non-Gaussian SPDE-based models 

A simple way of moving beyond Gaussianity in the SPDE model (4) is to al- 
low for a stochastic variance parameter <f>. By choosing <j> as an inverse-gamma 
distributed random variable, the resulting field has t-distributed marginal dis- 
tributions and is therefore sometimes referred to as a t-distributed random field 
(R0islien and Omre, 2006). In a Bayesian setting, this extension can be inter- 
preted simply as choosing a certain prior distribution for the variance, and one 
can of course come up with many other non-Gaussian models by changing this 
distribution. However, models constructed in this way are non-Gaussian only 
in a very limited sense. Namely, every realization of them behaves exactly as a 
Gaussian field with a globally re-scaled variance, and because of this, they are 
all non-ergodic as the parameters in the prior distribution cannot be estimated 
from a single realization of the field. One would prefer a non-Gaussian model 
where the actual sample paths behave differently from a stationary Gaussian 
field, and one way of achieving this is to let the variance parameter be spatially 
and stochastically varying. Both Lindgrcn, Rue and Lindstrom (2011) and Bolin 
and Lindgrcn (2011) explores this option by expressing log0(s) as a regression 
on a few known basis functions where the stochastic weights are estimated from 
data. This was interpreted as a non-stationary Gaussian model, but could also 
be viewed as a, somewhat limited, non-Gaussian model with a slowly spatially 
varying variance parameter 4>(s). To obtain a model which is intrinsically non- 
Gaussian also within realizations, one can draw c/>(s) at random independently 
for each s. The right-hand side of (4) is then a product of two independent 
noise fields. The following non-Gaussian models essentially can be interpreted 
as a formal realization of this idea. 

One interesting type of distributions, obtained by taking a random vari- 
ance and mean in an otherwise Gaussian random variable, are the generalized 
asymmetric Laplace distributions (Abcrg, Podgorski and Rychlik, 2009). The 
Laplace distribution is defined through the characteristic function with para- 
meters /Lt, 7 G M and <r, r > 

ip(u) = e l ~« u (l-ifiu+^u 2 

The distribution is symmetric if /i = and asymmetric otherwise. The shape of 
the distribution is governed by r and the scale by a. The distribution is infinitely 
divisible, and a useful characterization is that if Z is a standard normal variable 
and r is an independent gamma variable with shape r, then 7 + fiT + aVTZ 
has an asymmetric Laplace distribution. 

Stochastic Laplace noise can now be obtained from an independently scattered 
random measure A, defined for a Borel set B in K d by the characteristic function 

/ 2 \ -m{B) 

^A(fl) (u) = e^ B > M -ifiu+ yii 2 ) 

where the measure m is referred to as the control measure of A. This does not 
define Laplace noise in a direct manner, but similarly to how Gaussian white 




David Bolin/ Spatial Matern fields driven by non-Gaussian noise 



5 



noise can be seen as a differentiated Brownian sheet (Walsh, 1986), Laplace 
noise can be viewed in the sense of distributions (generalized functions) as a 
differentiated Laplace field. The most transparent characterization is through 
the following series representation of the Laplace field A(s) on a compact set 
D e M d : 

OO 

A(s)-7s + ^(r fc + G fc Vr^)i(s>Sfe), seA (6) 

fc=i 

where Gk are iid N(0, 1) random variables, s& are iid uniform random variables 
on D, and 

1 if Si > Sk : i for all i < d, 
otherwise. 



l(s > s fc ) 



The random variables Tj. can be written as L^ = e~ vlh Wk where Wk are iid 
standard exponential variables and jk are the arrival times of a Poisson pro- 
cess with intensity 1. Thus, Laplace noise can be expressed as a distribution 
(generalized function) 



T + E^ + ^V 7 ^)^, (7) 



A 

fc=i 



where S Sk is the Dirac delta distribution centered at s^. 

The model of interest is the solution X to the Laplace-driven SPDE 

( K 2 -A)tx = A, (8) 

where both X and A are viewed as random variables valued in the space of 
tempered distributions. To clarify in what way the solution to this equation 
exists, we look at a general SPDE 

(k 2 - A)fJC = M, (9) 

where M is an arbitrary independently scattered i^-valued random measure 
with E(|M(dx)| 2 ) = C dx for some constant C < oo. Examples of such meas- 
ures are the Laplace measures of interest here but also standard Brownian sheets. 
As usual for fractional Laplacian operators (Samko, Kilbas and Maricev, 1992), 
T = (k 2 — A)^ is defined using the Fourier transform through F{T /) = Pf, 
where / is the Fourier transform of the function /, (Vf)(k) — (k 2 + k T k)t_f (k), 
and the operator T is well-defined for example for all / € L p (WL ) for 1 < p < oo. 
The definition applies also when / is a distribution or, more specifically, a 
tempered distribution. Thus, (9) is viewed as an equation for two random 
(tempered) distributions so the equation has to be interpreted in the weak sense 

TX{<p) - M(ip), (10) 



where ip is in some appropriate space of test functions. Now, the action of the 
self-adjoint operator T can be moved to the test function on the left-hand side 
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and (10) can be rewritten in a more explicit fashion as 

X(T(p,u) = J <p(b)M(6b,u). (11) 

Here, we have included the second argument uj € Q to highlight that the sought 
functional X is random, and the equation should hold for u in a certain full 
probability set fio G f2 and universally for each ip. 

To describe the solutions of (9), we need the Sobolev spaces H n of fractional 
order n. These are usually defined using the Fourier transform in the following 
way. Let E be the Schwartz space of rapidly decreasing functions on R d , for 
u € E' (the dual of E, also referred to as the space of tempered distributions), 
define the Fourier transform of u as u{p) = u(tp), where ip is the usual Fourier 
transform on R d of tp G E. Define a norm on E by 

IH| n = / (i + |k| 2 )>(k)| 2 dk 

and let H n be the completion of E in this norm. By Plancherel's theorem, one 
has that Hq = L2(K d ) and one can show that for the special case n G N, H n is 
identical to the classical Sobolev space of L 2 functions with all partial derivatives 
of order n or less in L 2 . The space H_ n is the dual space of H n and does in 
general contain distributions. 

Let us note that the right hand side of (11) in principle may not be defined 
on a full probability set uniformly for all ip. However, one can regularize M so 
that p — > M(ip) is in fact a random distribution. Indeed, since 

E(\M(<p)f) = cj ^(s) 2 ds = C|M| 2 , 

the random linear functional ip — »• M{ip) is continuous in probability on H n for 
any n > 0, and by Theorem 4.1 in Walsh (1986) there exists a version of M 
which is almost surely in iJ_„ for n > d/2. From now on we always assume that 
we deal with such a version. 

Following Walsh (1986), we say that X(-,u) is an i/„-solution of (9) if for a.e. 
uj, X(-,cj) is an element of H- n and (11) holds for every p e H n . In other words, 
we aim at finding a random functional X that almost surely is a distribution 
and satisfies (9) as a continuous functional on H n . The proof of the following 
proposition is similar to the proof of Proposition 9.1 in Walsh (1986) where 
the existence of the solution to the stochastic Poisson equation on a bounded 
domain in M. d was demonstrated. 

Proposition 3.1. Assume that M is an independently scattered L 2 -valued ran- 
dom measure with E(|Af(dx)| 2 ) = Cdx. Then for k > 0, a > 0, there exists a 
random functional X : H n x(l->l such that for a certain set Qq, P(^lo) = 1 
and for all uj G fio and all tp G H n 



X{ip,w) = / G a p(x)M(dx,uj), 



(12) 
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where G a (p(x) = J G Q (s, x)y(s) ds and G a is given by (5). This is the unique 
H n -solution to (9) if n > d/2, and moreover we have X £ H m almost surely for 
m < a — d/2. 

Proof. From the standard theory of fractional differential equations, one has 
that G a maps H n isomorphically onto H n + a (see e.g. Samko, Kilbas and Maricev, 
1992, p.547). Let X be any ff„-solution to (9) and let tp = G a (p. Applying (11) 
to i/j and using that TG a tp — ip one gets that 



Thus this solution also satisfies (12) and the solution is unique if it exists. 
To prove existence, let X be defined by (12) and take <p £ L2(R d ). Then 



where the last inequality follows from that G a maps H n - a — > H n boundcdly for 
a > 0. Thus, it follows that X is a random linear functional that is continuous in 
probability on H_ a . The embedding maps H ni — > H n2 are of Hilbert-Schmidt 
type if m > n 2 + d/2 (see e.g. Example la in Walsh, 1986), and using this with 
n-2 = —a together with Theorem 4.1 in Walsh (1986) one gets that there exists a 
version of X which is almost surely in H~ n ifn > d/2 > d/2 — a. From now on, 
X is such a version and we note that X £ H m almost surely for m < a — d/2. 

What is left to show now is that X with probability one satisfies (11) for 
each if £ H n for n > d/2. To that end, first note that if ip £ H n , then by the 
definitions of T and G a one has 



Let n > d/2 and fix <p £ H n . If M((p, ui) denotes the functional J cp(s)M( ds, lu), 
one has by the definition of X and by the equation above that 



Hence, there is a set £l v c ri with P(f2 v ) = 1 such that for each u> £ tt v one 
has X(T<p,lo) — M(ip,uj). Now, H n is separable, so we can chose a countable 
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base B — {bi\ c *L 1 in H n and define = n^fi^. Then equality holds for each 
/ G B and for each oj G 0,q and P(f2o) = 1 by the countability of B. 

The map (f — > T(f — > X(Tf) of H n — > H n ^ a — » R is continuous since X 
is continuous on H n for n > d/2 — a and T is a continuous map from H n to 
H n _ a . Thus, both A(T-,w) and M(-,ui) are continuous functionals on H n for 
w in some full probability set and equality therefore holds in (11) for each 
ip G H n for uj (z flo — Qq H flo since B is linearly dense in H n . □ 

Remark 1. By similar arguments one can show that the solution X defined 
in Proposition 3.1 also is a solution to the SPDE (9) in the sense that with 
probability one X G E' and (11) holds for every ip € E. This is, however, a 
weaker statement since E = D n H n and E' = U n H n . 

Remark 2. The solution X defined in Proposition 3.1 is in general a random 
linear functional. However, it can be identified with a random function if a > d/2 
since X G H m almost surely for m < a — d/2. Using the relation between a and 
the parameter v in the Matern covariance function, a = v + d/2, we see that 
X G H rn almost surely for m < v. Thus, v acts as a smoothness parameter for 
the solution since the sample paths almost surely will be differentiable if v > I, 
two times differentiable if v > 2 etc. 

Remark 3. The previous remark can be strengthened using the Sobolev em- 
bedding theorem which shows that H n can be embedded in the Holder space 
Cl(R d ) where n- (r + fc) = d/2 and r G (0, 1) (see e.g. Adams, 1975). The space 
C£(R d ) consists of functions such that all partial derivatives up to order k are 
continuous and such that the fcth partial derivatives are Holder continuous with 
exponent r. Thus, if v > d/2, we almost surely have X G C£(R d ) (after possibly 
redefining it on a set of measure zero) where k is the integer part of v — d/2 and 
r = v - d/2-k. 

We now go back to the special case of Laplace noise and since the main 
interest here is ordinary random fields with Matern covariance functions, we 
from now on assume that a > d/2 in (8). One sometimes uses m(A) — 1(A)t, 
where I is the Lebesgue measure and r some constant, as a control measure for 
A. By the definition of the differential operator T, it is then easy to see that 
the spectrum for the solution X is 

V + m 2 ) i 



Rx(k) 



(2ir) d ( K 2 + k T k) 



Thus, the covariance function for A is a Matern covariance of the form (1) with 
cj) 2 = t(<t 2 +[j, 2 ). Since X is Laplace noise convolved with a Green function, which 
also has the form of a Matern covariance function, the model is equivalent to the 
Laplace moving average models in Aberg, Podgorski and Rychlik (2009), Aberg 
and Podgorski (2011). Thus, using Theorem 1 in Aberg and Podgorski (2011), 
the marginal distribution for A(s) is given by the characteristic function 

4>x{u) = exp J ijG a (s, t)u - log ^1 - i^uG a (s, t) + ^-G 2 a (s, t)^j dt 

(13. 
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Figure 1. Marginal distributions of the solution X(s) to (8) in the case of symmetric (left 
panel) and asymmetric (right panel) Laplace noise. 

A few examples of the marginal distributions for symmetric and asymmetric 
cases are shown in Figure 1. 

4. Hilbert space approximations 

To obtain a computationally efficient representation of a Matern field, the Hil- 
bert space approximation technique by Lindgren, Rue and Lindstrom (2011) 
can be used. The starting point is to consider the stochastic weak formulation 
(10) of the SPDE. A finite element approximation of the solution X is then ob- 
tained by representing it as a finite basis expansion X = J^., Witpi(s), where 
the stochastic weights are calculated by requiring (10) to hold for only a specific 
set of test functions {?/>.;, i — 1, . . . ,n} and {<Pi} is a set of predetermined basis 
functions. To simplify the presentation, we first look at the case a/2 S N and 
then turn to the case of a general a > d/2. 

4.1. The case a/2 e N 

To construct the approximation for a = 2, 4, . . ., we first look at the fundamental 
case a = 2. Lindgren, Rue and Lindstrom (2011) then use ?pi = tfi, and one 
then has 

n 
3 = 1 

where (/, g) — J f(s)g(s) ds. By introducing the vector w = (w%, . . . , w n ) T and 
a matrix K with elements = Upi, (n 2 — A)ipj), the left hand side of (10) 
can be written as Kw. Under mild conditions on the basis functions, one has 

((Pi, (k 2 - A)cpj) = k 2 fa, (pj) - fa, Acpj) 

= K 2 (ifi, Ifj) + {Vlfi, Vlfj) . 

Hence, the matrix K can be written as the sum K = k 2 C + G where C and G 
are matrices with elements Cy = (ifi, ipj) and Gy = (V(/?i, V(/?j) respectively. 
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4.1.1. Gaussian noise 

In the Gaussian case, when M is Gaussian white noise, the right hand side of 
(10) under the finite element approximation can be shown to be Gaussian with 
mean zero and covariance C. Thus, one has 

w~ N (O.K^CK -1 ) . (14) 

For higher order a/2 G N, the weak solution is obtained recursively. If, for 
example, a = 4 the solution to (re 2 — A) 2 Xq — W is obtained by solving 
(re 2 — A)Xq = X, where X is the solution for the case a = 2. This results in re- 
placing the matrix K with a matrix K Q defined recursively as K Q = KC~ 1 K Q _2, 
where K2 = K. For more details about these representations in the Gaussian 
case, see Lindgren, Rue and Lindstrom (2011). 

So far, we have not specified how the basis functions should be chosen, 
but this choice will determine the quality of the approximation as well as some 
computational properties. If, for example, Daubechies wavelets are used as basis 
functions, the precision matrix (inverse covariance matrix) Q for the weights is 
a sparse matrix (Bolin and Lindgren, 2009), which facilitates the use of efficient 
sparse matrix techniques when using this model. Lindgren, Rue and Lindstrom 
(2011) used piecewise linear basis functions induced by triangulating the do- 
main, and in this case C is a sparse matrix, but its inverse is dense. To obtain 
a sparse precision matrix in this case (which is needed for efficient GMRF com- 
putations), one can approximate C with a diagonal matrix C with elements 
Cjj = Jipi(s)ds. To simplify the notation later, we denote the zth element on 
the diagonal by as it is the area where (pi > tpj for j^i. For more details 
on this approximation and the choice of basis functions, see Bolin and Lindgren 
(2009). 



4-1.2. Laplace noise 

For the Laplace case, one has M = A in the weak formulation (10). Under 
the finite element approximation, the left-hand side can, as in the Gaussian 
case, be written as K Q w. Using Theorem 1 in Aberg and Podgorski (2011), the 
distribution of the right-hand side in the case of Laplace noise is given by the 
characteristic function 

0a(u) = exp J «7y5(s) T u - log ^1 - ifiip(s) T u + ^-(y>(s) T u) 2 ^ ds^j , 

where <p(s) = ((pi (s), . . . , (p n (s)) T . This representation is not very convenient for 
approximation and simulation of the model. Instead we will use a representation 
based on the series expansion (6) of A. However, for a moment, we turn to the 
more general setup of type-G processes to hint at how this technique could be 
applied also for this broader class of random fields. 

Recall that a Levy process is type G if its increments can be represented as a 
Gaussian variance mixture V 1 / 2 Z where Z is a standard Gaussian variable and 
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V is a non-negative infinitely divisible random variable. Clearly, the Laplace 
fields are of type G as their increments are of the form Y 1 I' 1 Z where L is a 
gamma variable. Rosihski (1991) showed that every Levy process of type G can 
be represented as a series expansion similar to the expansion (6) for the Laplace 
fields. This expansion also holds in M d , and for a compact domain D £ M d it 
can be written as 

oo 

M(s) = 5^G fc 0(7fc)*l(s>Sfc), 

where the function g is the generalized inverse of the tail Levy measure for V 
and the other variables are the same as in the Laplace case (6). Since V is infin- 
itely divisible, there exists a non-decreasing Levy process V(s) with increments 
distributed the same as V. This process has the series representation 

oo 

V(s) = ^g( 7k )H(s> Sk ). (15) 

k=l 

Now, consider the integral of some basis function ipi with respect to M , which 
can be represented as 

~ oo 

/ ^(s)Af(ds)^^^(s fc )G feV / 5( : >). (16) 
Jd fe =i 

Thus, the distribution of (f D tpi(s)M(ds), ■ ■ ■ , f D ip n (s)M(ds)) can be approx- 
imated in distribution by taking partial sums of the series in (16). Another way 
of calculating the distribution is to evaluate the integrals by conditioning on the 
variance process V(s) (Wiktorsson, 2002); given that j D <pf(s)V(ds) < oo, the 
integral conditionally on V is simply a Gaussian variable 



Pi (s)M(ds)|F~N 0, / tpt(s)V(ds) 
d \ Jd 

Going back to the case of Laplace noise. If M is a Laplace field corresponding 
to the Laplace measure A, the variance process is a gamma process, L(s), so by 
the argument above one has that the right hand side of (10) under the finite 
element approximation and conditionally on the gamma process is N(m, S), 
where the elements of m and £ are given by 

Sy ;=C Q% 4 (s)A(ds),^^(s)A(ds) Tj =yV;(sV,(s)L(ds), 

m l = E^y pi(s)A(ds) T^j=jJ <^(s)ds + J ^(s)L(ds). 

Given this, the weights w can be calculated conditionally on the gamma process, 
r(s), as 

w|r~N(K- 1 m,K^ 1 SK- 1 ), (17) 
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where K Q is defined recursively as in the Gaussian case. 

It would seem as one has not gained much by using the conditional repres- 
entation since the conditional mean and covariances, rh; and Sy, do not have 
any simple distributions. One way of approximating them is to approximate the 
integrals with respect to the Gamma process using the right hand side of (15) 
with a finite number of terms. However, by using compactly supported linear 
basis functions, one can simplify things further. Thus, now assume that the basis 
functions are piecewise linear functions induced by some triangulation of the do- 
main. One can then perform the same Markov approximation as in the Gaussian 
case. This results in an approximation of the right-hand side of (10) condition- 
ally on the gamma process distributed as IM(m, E) with m = jt& + fiT, and 
£ = diag(r). Here, the gamma variables I\ ~ r(ra,-, 1) are independent and 
&i = J ifi(s) ds, and these can be calculated without numerically estimating the 
integrals with respect to the gamma process. 

Bolin and Lindgren (2009) studies how this approximation affects the res- 
ulting covariance function of the process in the Gaussian case, and it is shown 
that the error is small if the approximation is used for piecewise linear basis 
functions. Although additional studies are needed in the non-Gaussian case, the 
results are likely similar so that the simplification has no large impact on the 
approximation. Figures 2-4 show that the approximation is accurate in one and 
two dimensions as explained in Section 5. 

4-2. The solution for general a > d/2 

If one could approximate the solution to (8) for a = 1, the recursive scheme 
discussed above could be used to represent the solutions for all positive odd a. 
In the Gaussian case, Lindgren, Rue and Lindstrom (2011) use a least-squares 
method where the test functions are chosen as ipi = (k 2 — A) 2 ip i . The left-hand 
side of (10) can then be expressed as Kw and the right-hand side is a mean 
zero Gaussian variable with covariance matrix K. This follows from Lemma 2 in 
Lindgren, Rue and Lindstrom (2011), which shows that the covariance between 
element i and element j on the right-hand side can be written as 



The stochastic weights therefore form a GMRF w ~ N(0,K _1 ). This argu- 
ment is unfortunately not applicable in the non-Gaussian case as the covariance 
between the elements given the gamma process F(s) is 
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We have not been able to find an easy way of evaluating Sjj in the non-Gaussian 
case, and it seems as this least-squares procedure is not extendable to the non- 
Gaussian case. However, if one instead uses ipi = ipi, the right-hand side of (10) 
conditionally on the variance process is N(m, E), as in the case a = 2. With this 
as a starting point, one can use a finite element matrix transfer technique (FE- 
MTT) to obtain a discretized approximation of the solution. Simpson (2008) 
studied such methods for sampling generalized Matern fields on locally planar 
Riemannian manifolds, and argued that one could sample the stochastic weights 
for a general a using the matrix transfer equation (C _1 K) Q / 2 w ~ N(0, C _1 ). 
To simplify the notations in later sections, denote K Q = (C _1 K) a / 2 and note 
that we now have changed the definition of K Q from the one that was used 
for even a. The weights w are then mean zero Gaussian with a precision mat- 
rix Q Q = K«C ^K^. In the case a — 2, this discretization coincides with the 
approximation described above, but it can be used for any a > d/2. 

Now in the non-Gaussian case, the results from the case a = 2 can be used 
directly to get a right-hand side that is Gaussian with mean m and covariance 
£ conditionally on the variance process. As in the Gaussian case, this should 
be multiplied with C _1 to get consistency in the FE-MTT procedure. Hence, 
in the case of Laplace noise the weights are given by 

w|r ~ N (K~ 1 C _1 m, K^C^EC^K- 1 ) . (18) 

Again, for the case a = 2, this coincides with the procedure described in the 
section above, and because of this we will from now on use this FE-MTT proced- 
ure for all a > d/2. Consistency of the FE-MTT procedure follows from similar 
arguments as in Simpson (2008). These arguments do not provide a rate of con- 
vergence as the number of basis functions are increased, and as for the Gaussian 
case, the rate of convergence and the numerical properties of the approximation 
are strongly dependent on a. 

5. Sampling from the model 

Using the finite element representation obtained in the previous section it is easy 
to generate samples from the SPDE (8). Assume that we want sample the model 
at locations s = (si, . . . , s„), and let $ be a matrix with elements = ipj(si). 
Samples can now be generated using the following three-step algorithm. 

Algorithm 5.1. Sampling the Laplace driven SPDE (8). 

1. Generate two independent random vectors T and Z, where Ti ~ r(raj, 1) 
and Zi ~ N(0,1). _ 

2. Let A = 7ra + fiT + diag(v / T)Z and calculate w = C _1 A. 

3. X = ^K^w is now a sample of the random field at the locations s. 

The last step could potentionally be computationally expensive for large sim- 
ulations. However, if a is even, one can take advantage of the sparsity of K a and 
solve the equation system v = K^w efficiently without calculating the inverse 



David Bolin/Spatial Matern fields driven by non-Gaussian noise 



14 



0.8 




-2 2 4 6 8 10 12 14 0.1 0.3 0.5 0.7 0.9 




123456789 



Figure 2. The lower panel shows a simulation of the Laplace driven SPDE (8) on R with 
parameters fj, = 'y = a = X, t = 2, re = 15, and a = 2. The upper left panel shows a histogram 
of the samples from 1000 simulations together with the true density. The upper right panel 
shows the empirical covariance function for the samples (grey curve) together with the true 
Matern covariance function (black curve). It is difficult to see the grey curve since the two 
curves are very similar. 

by using Cholesky factorization and back substitution as suggested by Rue and 
Held (2005). For other a > d/2, K a is not sparse and the Cholesky method will 
not improve the computational efficiency. However, as Simpson (2008) shows, 
one can instead use Krylov subspace methods in the calculations to obtain effi- 
cient sampling schemes. The basic problem for general a is to solve the matrix 
equation v = (C _1 K) _ ^w, and there are a number of methods with different 
computational properties that can be used. In this work we use the method by 
Hale, Higham and Trcfcthcn (2008), which is based on combining contour in- 
tegrals evaluated by the periodic trapezoid rule with conformal maps involving 
Jacobi elliptic functions. 

In Figure 2, a simulation of a process on K with parameters fi = 7 = a = 1, 
t = 2, k = 15, and a = 2 is shown. Since K Q is sparse in this case, the Cholesky 
method is used for the simulation. In the upper left panel, a histogram of the 
samples from 1000 simulations is shown together with the theoretical density, 
calculated using numerical Fourier inversion of the characteristic function (13). 
In the upper right panel, the empirical covariance function of the samples is 
shown together with the theoretical Matern covariance function. Two more ex- 
amples of densities and covariance functions for different parameter settings are 
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Figure 3. Simulation results as in Figure 2 with different parameters. The top row shows a 
symmetric case with parameters /j = 7 = 0, tr = 1, re = r = 10, and ct = 1. The bottom row 
shows an asymmetric case with parameters /i = o = 0.1, 7 = 0, re = 20, r = 10 and a = 3.5. 



shown in Figure 3. In the upper panels, we have a = 1, which results in an 
exponential covariance function. The other parameters are /i = 7 = 0, a = 1, 
and r = « = 10, which results in a symmetric distribution. In the lower panels, 
we have a = 3.5 which results in a smoother field. The other parameters are 
fi = a = 0.1, 7 — 0, t = 10, and n — 20, which results in an asymmetric 
distribution. In both cases in Figure 3, the Krylov subspace method is used for 
the simulations. 

In Figure 4 and Figure 5, two simulations of fields on R 2 are shown to- 
gether with the corresponding covariance functions, densities, and empirically 
estimated versions based on 1000 simulations each. As seen in the figures for 
all five examples, there is a close agreement between the histograms and the 
true densities, and between the true covariance functions and the empirically 
estimated covariance functions for all these parameter settings, indicating that 
the approximation procedure works as intended. A more detailed analysis of the 
simulation procedure is outside the scope of this article, but it should be noted 
that the SPDE approximation using piecewise linear basis functions does not 
provide convergence of higher-order derivatives, and the simulation procedure is 
therefore not appropriate for applications where such properties are important. 
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Figure 4. A simulation of an asymmetric model (8) in R 2 where the parameters are k = 5, 
it = /^ = 7 = 1, t = 2, and a = 2. TVie covariance functions and densities for these fields can 
be seen in the second row. The empirically estimated versions are based on 1000 simulations. 

6. Parameter estimation 

Parameter estimation for Laplace moving average models is not easy since there 
is no closed form expression for the parameter likelihood. Recently, Podgorski 
and Wegener (2011) derived a method of moments-based estimation procedure 
for these types of models. In their method, the convolution kernel is first estim- 
ated from the spectral density of the data, and given the estimated kernel, the 
parameters in the Laplace distribution are estimated by fitting the theoretical 
moments of the Laplace distribution to the sample moments. The method is 
quite simple although some special care has to be taken to handle the cases 
when the method of moments equation system does not have a solution, which 
can happen for certain values of the sample skewness and excess kurtosis. 

Using the SPDE formulation, parameter estimation can instead be performed 
in a likelihood framework. One of the advantages with this is that maximum 
likelihood parameter estimates always are in the allowed parameter space. An- 
other advantage is that the estimates will account for all relevant information 
in the data, which might not be the case for method of moment estimates. 

To be able to estimate the parameters in a maximum likelihood framework, 
the problem is interpreted as a missing data problem which facilitates use of the 
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Figure 5. A simulation of a symmetric model (8) in R 2 with parameters n = 5, <r = 1, 
fi = 7 = 0, t — 2, and a = 4. TTie covariance function and density are shown in the second 
row. The empirically estimated versions are based on 1000 simulations. 



Expectation Maximization (EM) algorithm (Dempster, Laird and Rubin, 1977). 
The proposed EM algorithm is based on the same ideas as the ones in Lange, 
Little and Taylor (1989) and Protassov (2004) which looked at EM estimation in 
the case of iid observations of certain Gaussian mixtures. Our main contribution 
is the extension of these ideas to the random field setting. 

Assume we have measurements X of the process X(s) taken at some loca- 
tions and that the Hilbert space approximation procedure is used with a basis 
obtained by triangulating the measurement locations. In this case, the matrix 
«& is diagonal and conditioning on the measurements and the parameters is 
equivalent to conditioning on A and the parameters as there is a one-to-one 
correspondence between the two through A = CK a $ _1 X, see Algorithm 5.1. 
To obtain simpler updating expressions, we first make a change of variables by 
introducing the parameter 7 = 7T and estimate this parameter instead of 7. 
As for Gaussian Matern models, the shape parameter v is difficult to estimate 
accurately and it is therefore assumed to be known throughout this section and 
no attempt is made at estimating it. 

Augmenting the data with the unknown (missing) gamma variables, the aug- 
mented likelihood is L(0|X,r) = 7r(X|r, &)-k{T\Q\ and the loss-function that 
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is needed for the EM-procedure is 

(2(6,0®) = E flog L(0|X,r) X,0«), 



where 6® is an estimate of 6 = (k, a, [i, 7, r) at iteration j, and the expectation 
is taken according to the distribution of T given X. We have X|T, ~ N(m, er 2 S), 
where m = ^K^C^a+^r), S = ^K^CDrCK" 1 *, and D r is the diagonal 
matrix with the vector T on the main diagonal. The second part of the augmen- 
ted likelihood can be written as tv(T\0) = J\tt (Ti\0) since the components in T 
are independent gamma variables 1^ with shape parameters ra^ and scale one, 
where a^ are known constants depending on the basis used. The log-likelihood 
is 

logL(0|X, r) = - nlog(a) + log(|K Q |) - ^(X - m^ST^X - m) 

n 

+ Oi log - log r(ra,)) + C, 

»=i 

where the constant C does not depend on the unknown parameters. Thus, using 
the relation between X and A, the loss-function is 

Q(0, 0®) = - nlog(<r) + log(|K„|) - 5^2 ((A - 7 a) T D E(r - lw (A - 7 a) 
+ ^ 2 l T E(r|*) + 2 7 ^a T l - 2/zA T l 



+ (raiEClogr^) - lo g r(ra 4 )) + C, 

i=l 

where E(-|*) denotes E(-|0W,X). The expectations needed to evaluate the loss- 
function are E(T|*), E(r _1 |*), and E(logFj|*). To calculate these, first note that 
(see Gradshteyn and Ryzhik, 2000, formula 3.472.9) 

X(a,b,c)=J x a ~ 1 e-^- cx dx = 2[-j K a [2VbPj . (19) 
Using this expression, the expectation E(rj|*) can be written as 

E ( r iW . /r 1 ^r 1 |x,«dr i . /MS^pW 

_ /r^(x|r,, oy{Ti\o) dr, _ ^{ra i + 1 , 2<T , ,i + £t) 



I T(x|r<, 0MTi\o) dr, x ^ _ i ^-^)\ i + 



So 3 



A, : -78*1 K r^+h\ !T 2 |Ai-7a i |v / 2a 5 + A* 2 



\/ 2CT2 + M 2 ^ ai _| ( < 7- 2 |A i -7a i |V2<7 2 +// : 
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If the argument in the Bessel functions is very small or very large one might 
get numerical problems when evaluating this expression depending on how it is 
implemented. In the case of small arguments, one can use the following approx- 
imation to improve the numerical stability 

K a {x) w {^j ' ' , if a £ and x « y/\afFl. 

The expectation E(r,|*) then simplifies to 



l\ 2a T > 1 



In the case of large arguments, one can instead use the approximation 

ir«-i(z) V 2; x ' 

which gives the following approximation for E(rj|*) 



E(Ti|*) 



V2ct 2 + m 2 2(7 2 + ^ 2 



The expectation E(r i is calculated similarly using (19) and can be writ- 
ten as 



_! . V2t 2 +M 2 ^-§ ( CT 2 l A i-7ai|^ 



2 J- /i 2 



E(rr» = v ^ ±_± ( . (20) 

IAi-7^1 K ( CT -2| Ai „^|V2a 2 + M 2 



Evaluating modified Bessel functions numerically is computationally expensive 
and should therefore be avoided as much as possible when implementing the 
estimation procedure. To that end, one can express K TgL _3 (•) using the following 
recurrence relationship for modified Bessel functions 

K a (x) = K a+2 (x) - ^±^-K a+1 (x), 

x 

giving the following expression for E(r" 1 |*) in terms of E(r^) 

PfT -i U s (M 2 + 2a 2 )E(I»-g 2 (2Ta. t -l) 
( * l '~ (A,-7a 4 ) 2 

Using this expression instead of (20), one only has to evaluate two modified 
Bessel functions instead of three. 

Finally, the expectation E(log(rj)|*) is similarly written as 

E(iog ( row = ^ j) ^ (X|ri ' 0)7r(r ^ )dri 



7r(X|0) 
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The denominator is the same as in the previous expectations, while calculating 
the nominator requires evaluating an integral on the form 

Xlog(a, 6, c) = J log(x)a; a_:L exp - - cx^j dx. (21) 

To calculate this integral, we differentiate (19) with respect to a and obtain 

W«,M = ^"V'.-S-d* = A (2 (0* Ka ( 2 ^)j 

= 2(5) S (lo g (^„( 2 v^) + A K „( 2 ^ 



The derivative of K a (2\bc) with respect to a can be expressed using infinite 
sums of gamma- and polygamma functions; however, in this case it is easier to 
numerically approximate the derivative using for example forward differences: 

d , ^ K a+£ foy/te) - K a hy/bcj 
—K a hVbc) ~ ^ '- ^ 



da v j e 
Using this expression, we approximate E(log(rj)|*) as 

T, C™ • - 1 ( A .-7 a i) 2 1 1 jl 
Mog\TA % 2 , ' 1+ 2cr 

E(iog(r 4 )|*) - 




1 (A,-7a,)^ 1 j£ 
2 ' ' T 2r 



(cr- 2 |Ai -7ai|v/2CT 2 
o- 2 |A 4 -7^1^20-2 +/1 2 

To obtain the updating equations for the parameters, the loss- function should 
be maximized with respect to each parameter, for example by differentiating it 
with respect to the parameters and setting the derivatives equal to zero. Since 
the system of equations obtained from this procedure is not analytically solv- 
able, one would have to iterate numerically in each step to obtain the parameter 
updates if the EM algorithm is used without modifications. A better alternative 
is to use an Expectation Conditional Maximization (ECM) algorithm (Meng 
and Rubin, 1993) where the M-step is divided into two conditional maximiz- 
ation steps. In the first step, the parameters of the Laplace noise is updated 
conditionally on the current value of k, and in the second step n is updated 
conditionally on the other parameters. Differentiating the loss-function with re- 
spect to (i, 7, and a and setting the derivatives equal to zero yields the following 
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updating rules 



(A T l)(a T D E(r - lk) a) - (a T l)(A T D E(r - lw a) 



7 



(l T E(r|*))(aTD E(r - lw a)-(l T a)2 

wll , (l T E(rK))(A T D E(r - 1H a) - (A T l)(a T l) 
(l T E(rK))(aTD E(r - lw a)-(l T a)2 




W (l T E(rK))(aTD E(r - 1| , ) a)-(l T a)2 

(A T D E(r - 1| , ) a) 2 (l T E(rK)) + (a T D E(r - lk) a)(A T l) 2 y 
(l T E(r|*))(aTD E(r - lw a)-(l T a)2 J 

In general, there is no closed form expression for the conditional updating equa- 
tion for r, so the following equation is maximized numerically to obtain t^ +1 ' 

n 

Q T = E (ra<E(logI\|*) - logl^a,)) . 

i=l 

In the special case when all a^ are equal to some value a, which for example 
is the case if a triangulation induced by a regular lattice is used in the Hilbert 
space approximation, the solution can be written as 

r^) = V ^EE(logr iW ), 

where is the inverse of the digamma function. Finally k is updated con- 

ditionally on the other parameters. There is no closed form expression for the 
updating equation for n either, so the following expression is maximized numer- 
ically with respect to k, 

Q K =log(|K a |) - ^-JL_ (A T D E(r - lw A 

- 27^' +1 >A T D E(r - lw a-2^ +1 )A T l) . 
By the construction of K a , its log-determinant can be written as 

n 

log(|K Q |) = ~ log IC"^ + k 2 I\ = | E lo s( A « + " 2 )> 



2 

1=1 



where denotes the ith eigenvalue of C _1 G. If the size of K Q is small, these 
eigenvalues can be pre-calculated as they do not depend on the parameters. 
For larger problems is it most efficient to calculate the log-determinant in each 
iteration using a sparse Cholesky factorization of K = G + k 2 C. 

As shown by Mcng and Rubin (1993), the ECM algorithm has the same con- 
vergence properties as the ordinary EM algorithm. The likelihood is increasing 
for each iteration and the convergence is linear. Hence, we do not lose any rate 
of convergence by using the ECM algorithm instead of the EM algorithm. 
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7. A simulation study 

In this section, a simulation study is performed to test the accuracy of the para- 
meter estimation algorithm presented above. The algorithm is tested for twelve 
different parameter settings corresponding to marginal distributions shown in 
Figure 6 for processes in one dimension with a = 2. For Matern covariance func- 
tions, one sometimes defines the approximate range as r = y/SuK^ 1 , which is 
the value where the correlation is approximately 0.1. For the first six test cases, 
we have k = 1 which corresponds to an approximate range of 3.5, and for the 
last six cases we have k = 0.1 which corresponds to an approximate range of 35. 
For each value of k, three symmetric distributions and three asymmetric distri- 
butions are used. In Figure 6, the distributions for the short range are shown in 
the two upper panels, and the distributions for the long range are shown in the 
two bottom panels. 

For each set of parameters, 500 data sets are simulated using Algorithm 5.1, 
where each data set contains 1000 equally spaced observations on [1, 1000]. The 
basis used in the Hilbert space approximations consists of 1000 piecewise linear 
basis functions centered at 1,2,..., 1000. For each data set, the starting value 
for k is set to y/Svr^ 1 , where r is the approximate range for the empirical 
covariance function for the data set. To obtain good starting values for the 
other parameters, an initial run of the EM estimator is made with k fixed to the 
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Table 1 

Parameter settings for the twelve cases shown in Figure 6 the estimation procedure is tested 
for. In the parentheses, the 10% 50%, and 90% percentiles of 500 Monte Carlo samples are 
shown. Note that most estimates seem to be unbiased, perhaps with the exception of the 

estimates of r in case L. 



starting value and where the starting values for \x and 7 are drawn independently 
from a N(0, 1) distribution, and the starting values for a and r _1 are drawn from 
a x 2 (l)-distribution. After 100 steps, this initial run is ended, and the estimates 
are used as starting values for the full EM-estimator. 

In Table 1, the 10% 50%, and 90% percentiles of 500 Monte Carlo samples 
are shown for each parameter setting, together with the true values of the para- 
meters. One can note that all estimates are more or less unbiased and have 
fairly small variances, indicating that the estimation procedure works as in- 
tended. The only case where the estimator seems to have a bias is in case L, 
where most of the estimated values of r are above the true value. The cause of 
this bias is probably that the estimation procedure is not very stable for small 
values of r because some of the expectations E(r^ |*) can be infinite in this 
case. More precisely, for r < 3/(2min(ai, . . . ,a„)), the likelihood is unbounded 
for any 7 = Aj/aj and the ML procedure thus has to be modified. To improve 
the stability of the algorithm, the expectations E(r~ 1 |*) are truncated to 1000 
in the first iteration, and for each iteration this bound is made larger so that 
it has little to no effect after a few hundred iterations of the algorithm. This 
greatly improves the stability for t < 3/(2 min(ai, . . . , a„)), but it is left for 
future research to justify this modified maximum likelihood procedure theor- 
etically, to derive large sample properties of the estimator, and to investigate 
other improvements for the case of small values of r. 

It should finally be noted that the parameters are estimated assuming the 
same finite element approximation as is used for simulating the data. Estim- 
ating the model parameters using a different numbers of basis functions in the 
approximation can possibly give biased estimates, as the parameters are estim- 
ated to maximize the likelihood for the approximate model instead of the exact 
SPDE. The size of this bias depends on the specific parameters of the model, 
and especially on the true covariance range in relation to the spacing of the 
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basis functions, as discussed in Bolin and Lindgrcn (2011) in the case of Gaus- 
sian models. It is, however, outside the scope of this work to investigate this 
issue further here. 

8. Discussion and extensions 

We have showed how the SPDE approach by Lindgren, Rue and Lindstrdm 
(2011) can be extended to the case of Laplace noise and how this can be used 
to obtain an efficient estimation procedure as well as an accurate estimation 
technique for the Laplace moving average models. This is indented as a demon- 
stration that the methods in Lindgren, Rue and Lindstrdm (2011) are applicable 
to more general situations than the ordinary Gaussian models. There are also a 
number of extensions that can be made to this work which are discussed below. 

First of all, the Hilbert space approximation technique in Section 4 was de- 
rived using theory for Levy processes of type G, and although we only used 
this for the case of Laplace noise, the methods work equally well for this lar- 
ger class of models. All that is changed are the distributions of the integrals 
conditionally on the variance process. These techniques are also applicable to 
the case when more general SPDEs are used, one could for example use the 
nested SPDEs by Bolin and Lindgren (2011) to achieve more general covariance 
structures without any additional work needed, or one could include drift terms 
in the operator on the left-hand side to mimic the effects of asymmetric kernels 
in the Laplace moving average models. The methods are in fact not restricted 
to R d or stationary SPDEs, but can be extended to non-stationary SPDEs on 
general Riemann manifolds. 

Secondly, the estimation procedure in Section 6 assumed that one basis func- 
tion was used for each observation of the process. The reason being that this 
gives us a one-to-one correspondence between the observations and the Laplace 
variables A which simplified the estimation procedure. For practical applications 
this is not ideal as one would like to be able to choose the basis independently 
of the measurement locations, and it would also be useful if one could assume 
that the measurements are taken under measurement noise. If the estimation 
procedure could be extended to handle these cases, the practical usefulness of 
these models would greatly improve. 

As mentioned in Section 7, the estimation procedure is sensitive to the value 
of t. Too large values will result in a model which is very similar to a standard 
Gaussian model, and it might be difficult to accurately estimate the parameters 
in this case without a very large data set. This is not a big problem as if the 
data is Gaussian, one should not use these models but a standard Gaussian 
model. The estimation procedure is also unstable for small values of r, and 
modifications to further improve the stability in this case are currently being 
investigated. 
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